Personality Mining from Biographical Data with the "Adjectival Marker" Technique

Authors

  • Shivani Poddar
  • VenuMadhav Kattagoni
  • Navjyoti Singh
Abstract

The last decade has witnessed significant work in personality mining from lexical cues in social media data. However, little work has yet been undertaken on extracting these lexical cues from the biographical data that populates social media. Most existing work relies on dictionary-based approaches such as LIWC, which focus primarily on function words. In this paper we introduce a novel method of personality mining from social media data, the "Adjectival Marker Technique". The method extracts lexical features from descriptive texts (e.g. biographical data) to train a learning model that predicts the personality traits of the subject. Conceptually, it draws heavily on the last 78 years of work in lexical psychology and on the Big Five personality test. It is not only a computational variant of the foundational theories of lexical psychology; it also achieves substantial accuracy in personality prediction, matching the results obtained by psychometric tests. In this study, we propose a variant of the Lexical Hypothesis from psychology. This modified hypothesis is validated by the computational results of personality prediction achieved by the Adjectival Marker Technique. The paper also discusses some insights illustrating the coherence of people's judgments about the subject's personality (virtual personality). The average prediction accuracy (i.e. agreement with the results of psychometric tests for the Big Five) was approximately 82.82% for Extraversion, 89.62% for Agreeableness, 92.48% for Conscientiousness, and 81.67% for Imaginativeness/Intellect.
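
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the marker lexicon `MARKERS`, the trait ordering `TRAITS`, and the function `marker_features` are hypothetical names introduced here, and the scoring rule (summing signed adjective hits per trait) is an assumption chosen for simplicity. The resulting vector is the kind of input a learning model could be trained on.

```python
import re

# Hypothetical adjective lexicon (illustrative only, not from the paper):
# adjective -> (trait it marks, polarity of the marker).
MARKERS = {
    "outgoing": ("Extraversion", 1),
    "talkative": ("Extraversion", 1),
    "shy": ("Extraversion", -1),
    "kind": ("Agreeableness", 1),
    "rude": ("Agreeableness", -1),
    "organized": ("Conscientiousness", 1),
    "careless": ("Conscientiousness", -1),
    "creative": ("Intellect", 1),
    "unimaginative": ("Intellect", -1),
}

# Fixed trait order so every text maps to a vector of the same shape.
TRAITS = ["Extraversion", "Agreeableness", "Conscientiousness", "Intellect"]


def marker_features(text):
    """Return one signed marker count per trait for a descriptive text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {trait: 0 for trait in TRAITS}
    for token in tokens:
        if token in MARKERS:
            trait, polarity = MARKERS[token]
            scores[trait] += polarity
    return [scores[trait] for trait in TRAITS]


bio = "She is talkative, organized and creative, but sometimes careless."
print(marker_features(bio))  # [1, 0, 0, 1]
```

This sketch ignores negation and part-of-speech ambiguity; a fuller pipeline would likely use a tagger to identify adjectives rather than a fixed word list.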

Similar resources

Introducing an algorithm for use to hide sensitive association rules through perturb technique

Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...

Tolstoy Digital: Mining Biographical Data in Literary Heritage Editions

This paper presents a solution for mining the biographical information from commentaries on Leo Tolstoy’s letters. It is implemented as a part of Tolstoy Digital Project – a semantically marked-up web publication of the 90-volume complete collection of Leo Tolstoy’s works. Extraction of relevant biographical information will be used to create an open database for all the persons who were someho...

Explanation of Relationships between Biographical Characteristics and Entrepreneurship Spirit of Students

Three major causes of the importance of entrepreneurship are making wealth, developing technology and creating productive employment. It is generally believed that a revolution is needed for entrepreneurship to take place in societies nowadays. Thus, the present study aims at investigation of the biographical characteristics and explanation of its relation to entrepreneurial spirit at Mazandara...

Toward Algorithmic Discovery of Biographical Information in Local Gazetteers of Ancient China

Difangzhi (地方志) is a large collection of local gazetteers compiled by local governments of China, and the documents provide invaluable information about the host locality. This paper reports the current status of using natural language processing and text mining methods to identify biographical information of government officers so that we can add the information into the China Biographical Dat...

Using Bacillus Cereus as a Geo-Biological Marker For Gold Prospecting in Iran

Several methods have been developed for gold exploration in the past, among which the biologically based method is known to be the most efficient and the least expensive. This method can also be used for exploring latent gold prospects. In the present study, the possibility of applying Bacillus cereus frequency in soil as a biological marker was investigated for the exploration of latent gold prospectin...

Publication date: 2015